Casting the Dice with FXSL: Random Number Generation Functions in XSLT

Dimitre Novatchev, March - April 2002

This article is a follow-up from two recent publications [1], [2] on functional programming in XSLT. Described is the Random module of the XSLT functional programming library FXSL [3], which implements a linear congruential method of generating a pseudo-random sequence of natural numbers starting with a seed and then getting the next element of the sequence, as described by Simon Thompson [4]. Such a sequence can then be scaled to natural or real numbers in any given interval. Numbers can be also generated, denoting a given event outcome with a specified probability (distribution). Implemented is a simplified Monte Carlo integration to test the "randomness" of the functions' results. The code of the implementation demonstrates the use of such powerful, functional programming design patterns as partial application and creation of functions dynamically.

The source code for this article is part of the FXSL library version 1.1 and can be downloaded here:

Using MSXML? Get the MSXML source code!

Using SAXON? Get the SAXON source code!

Using XALAN? Get the XALAN source code as adapted by Knut Wannheden !

Introduction

As the XSLT language and its usage evolve and mature over time, there have been repeated questions about the availability of random numbers generation in the language or how to implement it. The prevailing answer has been that one is to write their own extension functions, reflecting the lack of any standard random generation functions in XPath 1.0 [5]. Such a feature is also missing in the published "XQuery 1.0 and XPath 2.0 Functions and Operators" Working Draft [6].

This article has two goals:

To provide a set of functions for a majority of random numbers related tasks as part of the FXSL XSLT functional programming (FP) library.
To demonstrate how the implementation is taking advantage of some very powerful FP design patterns, such as partial application and creation of functions dynamically.

1. Generation of a single random number and a sequence of random numbers. Scaling a sequence.

In this article we follow the approach to random numbers generation described by Simon Thompson in [4]. While it may not be realistic to aim at producing a truly random number sequence, we'll implement generation of a pseudo-random sequence of natural numbers, using a linear congruential method. This method works in the following way: we start with seed and then get the next element of the sequence from the previous one using the following definition:

nextRand n = (n*multiplier + increment) `mod` modulus (1)

In our implementation we'll be using the following constants:

seed = 17489

multiplier = 25173

increment = 13849

modulus = 65536

From (1) and using the Haskell function iterate, the following will produce an infinite sequence of random numbers ranging from 0 to modulus - 1:

randomSequence :: Int -> [Int]

randomSequence = iterate nextRand

where the function iterate is defined in Haskell's Prelude in the following way:

iterate :: (a -> a) -> a -> [a]

iterate f x = x : iterate f (f x)

so it creates an infinite list with head the argument x and each consecutive element is the result of applying the function f to the previous element.

XSLT is a strict language (does not support lazy evaluation), therefore we need a finite analog of the iterate function:

scanIter :: (Ord a, Num a) => a -> (b -> b) -> b -> [b]

scanIter n f x

| n == 0 = [x]

| n > 0 = result ++ [f(last result)]

where result = scanIter (n-1) f x

As defined above, scanIter takes a number n, a function f and an argument (for f) x, and produces a list of n+1 elements, the first of which is x, and each next element is the result of applying f on the current element.

Now, we can produce a sequence of n + 1 random numbers in the following way:

randomSequence :: Int -> [Int]

randomSequence = scanIter n nextRand

If we want the numbers to come in the integer range s to t inclusive, we need to scale the sequence:

scaleSequence :: Int -> Int -> [Int] -> [Int]

scaleSequence s t

= map scale

where

scale n = n `div` denom + s

denom = modulus `div` range

range = t - s + 1

Note that we are using partial application in the definition of both randomSequence and scaleSequence. In our XSLT implementation of these functions partial application is produced by using the curry function, described in detail in [2].

Bellow is the XSLT implementation of nextRand, scanIter, scaleSequence and randomSequence.

randNext

<!--
===========================================================
 Stylesheet: random.xsl           Randomization Functions  
    Version: 1.0 (2002-03-16)                              
   Based on: Random number algorithms in Simon Thompson's  
             "Haskell, the craft of functional programming"
===========================================================-->
<xsl:stylesheet version="1.0" 
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:vendor="urn:schemas-microsoft-com:xslt"
 xmlns:randScale="f:randScale"
 xmlns:myRandNext="f:myRandNext" 
 xmlns:mySingleRandDistFun="f:mySingleRandDistFun" 
 xmlns:x="f:fxslRandom.xsl"
 exclude-result-prefixes="xsl vendor randScale 
 myRandNext mySingleRandDistFun x"
>

<!--
  ========================================================
    Imported files:                                       
  ========================================================-->
  <xsl:import href="map.xsl"/>
  <xsl:import href="curry.xsl"/>
  <xsl:import href="iter.xsl"/>
  
<!--
  ========================================================
    Global Randomizing constants:                         
  ========================================================-->
  <xsl:variable name="seed"       select="17489"/>
  <xsl:variable name="multiplier" select="25173"/>
  <xsl:variable name="increment"  select="13849"/>
  <xsl:variable name="modulus"    select="65536"/>
  
<!--
  This is a linear congruental method of generating       
  a pseudo-random sequence of natural numbers             
  starting with a seed and then getting the next element  
  of the sequence from the previous value like this:      
                                                          
  nextRand :: Int -> Int                                  
  nextRand n = (multiplier * n + increment) `mod` modulus 
-->  
  
<!--
      Template: randNext                                   
      Purpose : Return the value of the next random number 
                from the current                           
    Parameters:                                            
         $arg1  - the current random number,               
                  from which to produce the next           
  ======================================================== -->
  <xsl:template name="randNext" match="myRandNext:*">
    <xsl:param name="arg1"/>
    
    <xsl:value-of select="($multiplier * $arg1 + $increment) 
                         mod 
                           $modulus"/>
 </xsl:template
>

scanIter:

<!--
    Template: scanIter                                     
     Purpose: Iterate (compose a function with itself)     
              N times and produce a list, whose elements   
              are the results from the partial iterations  
  Parameters:-                                             
    $arg1   - the number of times to iterate               
    $arg2   - a template reference to the function         
              that's to be iterated                        
    $arg3   - an initial argument to the function          
    $arg4   - [optional] name for the elements             
              in the result node-set                       
  =========================================================-->
  <xsl:template name="scanIter">
    <xsl:param name="arg1"/>               <!-- n -->
    <xsl:param name="arg2" select="/.."/>  <!-- f -->
    <xsl:param name="arg3"/>               <!-- x -->
    <xsl:param name="arg4" select="'el'"/> <!-- elName -->
    
    <xsl:choose>
     <xsl:when test="$arg1 &lt; 0">
       <xsl:message>
        [scanIter]Error: Negative value for n.
       </xsl:message>
     </xsl:when>
     <xsl:when test="$arg1 = 0">
       <xsl:element name="{$arg4}">
         <xsl:value-of select="$arg3"/>
       </xsl:element>
     </xsl:when>
     <xsl:otherwise>
       <xsl:variable name="vrtfIterMinus">
         <xsl:call-template name="scanIter">
           <xsl:with-param name="arg1" select="$arg1 - 1"/>
           <xsl:with-param name="arg2" select="$arg2"/>
           <xsl:with-param name="arg3" select="$arg3"/>
           <xsl:with-param name="arg4" select="$arg4"/>
         </xsl:call-template>
       </xsl:variable>
       
       <xsl:variable name="vIterMinus" 
            select="vendor:node-set($vrtfIterMinus)/*"/>
       
       <xsl:variable name="vrtfLastIter">
         <xsl:element name="{$arg4}">
           <xsl:apply-templates select="$arg2">
             <xsl:with-param name="arg1" 
                             select="$vIterMinus[last()]"/>
           </xsl:apply-templates>
         </xsl:element>
       </xsl:variable>
       
         <xsl:copy-of select="$vIterMinus"/>
         <xsl:copy-of 
              select="vendor:node-set($vrtfLastIter)"/>
     </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

scaleSequence, randomSequence, randScale:

<!--
      Template: scaleSequence                              
      Purpose : Return a scaled version of a sequence      
                from a given sequence                      
    Parameters:                                            
       $pStart  - [optional] the start of the interval,    
                   in which randoms                        
                   are to be produced                      
       $pEnd    - [optional] the end of the interval,      
                   in which randoms                        
                   are to be produced                      
       $pList   - the list of numbers that are to be scaled
  =========================================================-->
  <xsl:template name="scaleSequence">
    <xsl:param name="pStart" select="0"/>
    <xsl:param name="pEnd" select="1"/>
    <xsl:param name="pList" select="/.."/>
    
    <xsl:variable name="vScaleFun" 
                  select="$x:st/randScale:*[1]"/>
    <xsl:variable name="vRange" select="$pEnd - $pStart + 1"/>
    
    <xsl:variable name="vrtfCurriedScale">
      <xsl:call-template name="curry">
        <xsl:with-param name="pFun" select="$vScaleFun"/>
        <xsl:with-param name="pNargs" select="3"/>
        <xsl:with-param name="arg2" select="$pStart"/>
        <xsl:with-param name="arg3" select="$vRange"/>
      </xsl:call-template>
    </xsl:variable>
    
    <xsl:variable name="vFixedScaleFun" 
         select="vendor:node-set($vrtfCurriedScale)/*"/>
                  
    <xsl:call-template name="map">
      <xsl:with-param name="pFun" select="$vFixedScaleFun"/>
      <xsl:with-param name="pList1" select="$pList"/>
    </xsl:call-template>
  </xsl:template>
  
  
<!--
      Template: randomSequence                             
      Purpose : Return a sequence of random numbers        
                starting from a seed                       
    Parameters:                                            
      $pLength  - [optional] the length of the sequence    
                   of randoms that must be produced        
      $pSeed    - [optional] the seed for the randomization
      $pStart   - [optional] the start of the interval,    
                  in which randoms are to be produced      
      $pEnd     - [optional] the end of the interval,      
                  in which randoms are to be produced      
  =========================================================-->
  <xsl:template name="randomSequence">
    <xsl:param name="pLength" select="1"/>
    <xsl:param name="pSeed" select="$seed"/>
    <xsl:param name="pStart" select="0"/>
    <xsl:param name="pEnd" select="$modulus - 1"/>
    
    <xsl:variable name="vFunRandNext" 
                  select="$x:st/myRandNext:*[1]"/>
    
    <xsl:variable name="vrtfUnscaledSeq">
      <xsl:call-template name="scanIter">
        <xsl:with-param name="arg1" 
                        select="$pLength - 1"/><!-- n -->
        <xsl:with-param name="arg2" 
                        select="$vFunRandNext"/><!-- f -->
        <xsl:with-param name="arg3" 
                        select="$pSeed"/> <!-- x -->
      </xsl:call-template>
    </xsl:variable>
    
    <xsl:choose>
      <xsl:when test="$pStart = 0 
                     and $pEnd = $modulus - 1">
        <xsl:copy-of 
           select="vendor:node-set($vrtfUnscaledSeq)/*"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:call-template name="scaleSequence">
          <xsl:with-param name="pStart" select="$pStart"/>
          <xsl:with-param name="pEnd" select="$pEnd"/>
          <xsl:with-param name="pList" 
            select="vendor:node-set($vrtfUnscaledSeq)/*"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

<!--   The following  scales a single number n 
       from the interval [0, modulus - 1]      
       to the interval [s, t]                  
                                               
       scale n = n `div` denom + s             
       range   = t - s + 1                     
       denom   = modulus `div` range           

-->  
  <xsl:template match="randScale:*">
    <xsl:param name="arg1"/> <!-- n -->
    <xsl:param name="arg2"/> <!-- s -->
    <xsl:param name="arg3"/> <!-- range -->
    
    <xsl:variable name="vDenom" 
                  select="floor($modulus div $arg3)"/>
    
    <xsl:value-of select="floor($arg1 div $vDenom) 
                                + $arg2"/>
  
  </xsl:template>

Let's test these functions now:

test1-random.xsl

<xsl:stylesheet version="1.0" 
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:vendor="urn:schemas-microsoft-com:xslt"
 exclude-result-prefixes="xsl vendor"
>

  <xsl:import href="random.xsl"/>
  
  <!-- To be run on any input xml file -->
  
  <xsl:output indent="yes" omit-xml-declaration="yes"/>
  
  <xsl:variable name="NL" select="'&#xA;'"/>
  
  <xsl:template match="/">
    
    <xsl:variable name="vrtfUnscaledRandoms">
    <xsl:call-template name="randomSequence">
      <xsl:with-param name="pLength" select="100"/>
    </xsl:call-template>
    </xsl:variable>
    
    <xsl:variable name="vUnscaledRandoms" 
        select="vendor:node-set($vrtfUnscaledRandoms)/*"/>
    
100 Randoms in (0, <xsl:value-of select="$modulus - 1"/>):
    <xsl:for-each select="$vUnscaledRandoms
                             [position() mod 8 = 1]">
      <xsl:value-of select="$NL"/>
      <xsl:for-each select=". 
                       | following::*[position() &lt; 8]">
        <xsl:value-of select="concat(., ', ')"/>
      </xsl:for-each>
    </xsl:for-each>
    
    <xsl:variable name="vStart" select="1"/>
    <xsl:variable name="vEnd" select="10"/>
    
    <xsl:value-of 
         select="concat($NL,$NL,
                        '100 Randoms in (',
                        $vStart, ', ',
                        $vEnd, '):',
                        $NL
                        )"/> 
    <xsl:variable name="vrtfScaledRandoms">
    <xsl:call-template name="randomSequence">
      <xsl:with-param name="pLength" 
                      select="100"/>
      <xsl:with-param name="pStart" 
                      select="$vStart"/>
      <xsl:with-param name="pEnd" 
                      select="$vEnd"/>
    </xsl:call-template>
    </xsl:variable>
    
    <xsl:variable name="vScaledRandoms" 
         select="vendor:node-set($vrtfScaledRandoms)/*"/>

    <xsl:for-each select="$vScaledRandoms
                                [position() mod 10 = 1]">
      <xsl:value-of select="$NL"/>
      <xsl:for-each select=". 
                      | following::*[position() &lt; 10]">
        <xsl:value-of select="concat(., ', ')"/>
      </xsl:for-each>
    </xsl:for-each>
    
  </xsl:template>
  
</xsl:stylesheet>

When this transformation is performed (ignoring the source xml document) the result is:

100 Randoms in (0, 65535):

17489, 59134, 9327, 52468, 43805, 8378, 18395, 59344,

52777, 23478, 21895, 18924, 6517, 29682, 22899, 61256,

14593, 34158, 40863, 5092, 6349, 60458, 46091, 13248,

58585, 17446, 25271, 2780, 2341, 26978, 47011, 38200,

12721, 30686, 65231, 3796, 19069, 52122, 50235, 62384,

32649, 150, 54247, 65484, 15573, 62162, 14803, 12072,

11873, 48718, 16895, 48580, 16429, 48906, 30827, 10144,

40505, 37126, 43287, 10428, 46213, 4162, 57347, 48408,

12049, 22718, 26927, 8372, 63965, 50810, 53403, 53136,

16617, 62838, 57927, 34220, 28725, 49586, 43571, 16136,

13249, 18222, 29791, 14244, 30605, 57834, 52427, 60288,

26521, 11750, 32631, 5788, 28645, 1826, 39011, 46328,

15473, 35230, 25487, 660,

100 Randoms in (1, 10):

3, 10, 2, 9, 7, 2, 3, 10, 9, 4,

4, 3, 1, 5, 4, 10, 3, 6, 7, 1,

1, 10, 8, 3, 9, 3, 4, 1, 1, 5,

8, 6, 2, 5, 10, 1, 3, 8, 8, 10,

5, 1, 9, 10, 3, 10, 3, 2, 2, 8,

3, 8, 3, 8, 5, 2, 7, 6, 7, 2,

8, 1, 9, 8, 2, 4, 5, 2, 10, 8,

9, 9, 3, 10, 9, 6, 5, 8, 7, 3,

3, 3, 5, 3, 5, 9, 9, 10, 5, 2,

5, 1, 5, 1, 6, 8, 3, 6, 4, 1,

2. Generating a random sequence following a distribution. Subsequences.

Suppose we want to model an event, which has N distinct outcomes. Sometimes we want that each outcome must have a predefined probability. First we define a Distribution type, which is a list of pairs of outcome object and outcome object's probability. In order for this to make sense, the objects in the list must all be distinct, and their probabilities must sum up to 1.

type Distribution a = [(a, Float)]

Suppose we need to model random outcomes 1 to 6 each with its given probability:

dist1 :: Distribution Int

dist1 = [(1,0.2), (2,0.25), (3,0.25), (4,0.15), (5,0.1), (6, 0.05)]

We will use a function, which given a distribution produces another function, that maps the interval [0,65535] into the set of distinct objects according to their probability. The idea is to divide the interval [0,65535] into N intervals (where N = length(dist) is the number of distinct objects), but in such a way, that the lengths of these intervals are proportional to the probabilities of the corresponding objects.

makeFunction :: Distribution a -> (Float -> a)

makeFunction dist = makeFun dist 0.0

makeFun ((ob, p) : dist) nLast rand

| nNext >= rand && rand > nLast

= ob

| otherwise

= makeFun dist nNext rand

where

nNext = p*fromInt modulus + nLast

Then the transformation of a list of random numbers into a list of (weighted) random numbers according to a distribution dist is given by the expression:

map (makeFunction dist)

Here's the XSLT implementation.

dist-randomSequence:

<!--
      Template: dist-randomSequence                        
      Purpose : Return the value of the next random number 
                from the current                           
    Parameters:                                            
      $pLength  - [optional] the length of the sequence    
                   of randoms that must be produced        
      $pDist    - a list of distributions                  
                 (outcome,probability pairs)               
                  e.g.:                                    
                    <myDistribution:myDistribution>        
                      <o>1</o><p>0.2</p>                   
                      <o>2</o><p>0.25</p>                  
                      <o>3</o><p>0.25</p>                  
                      <o>4</o><p>0.15</p>                  
                      <o>5</o><p>0.1</p>                   
                      <o>6</o><p>0.05</p>                  
                    </myDistribution:myDistribution>       
                                                           
    $pSeed      - [optional] the seed for the randomization
  =========================================================-->
  <xsl:template name="dist-randomSequence">
    <xsl:param name="pLength" select="1"/>
    <xsl:param name="pDist" select="/.."/>
    <xsl:param name="pSeed" select="$seed"/>
    
    <xsl:variable name="vrtfNormalRandomSeq">
      <xsl:call-template name="randomSequence">
        <xsl:with-param name="pLength" select="$pLength"/>
        <xsl:with-param name="pSeed" select="$pSeed"/>
      </xsl:call-template>
    </xsl:variable>

    <xsl:variable name="vmakeFun" 
                  select="$x:st/mySingleRandDistFun:*[1]"/>
    
    <!-- build makeFun dist 0.0 -->
    <xsl:variable name="vrtfDistFunction">
      <xsl:call-template name="curry">
        <xsl:with-param name="pFun" select="$vmakeFun"/>
        <xsl:with-param name="pNargs" select="3"/>
        <xsl:with-param name="arg2" select="$pDist"/>
        <xsl:with-param name="arg3" select="'0.0'"/>
      </xsl:call-template>
    </xsl:variable>
    
    <xsl:call-template name="map">
      <xsl:with-param name="pFun" 
           select="vendor:node-set($vrtfDistFunction)/*"/>
      <xsl:with-param name="pList1" 
        select="vendor:node-set($vrtfNormalRandomSeq)/*"/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="makeFun" match="mySingleRandDistFun:*">
    <xsl:param name="arg1"/>              <!-- rand -->
    <xsl:param name="arg2" 
               select="/.."/>      <!-- Distribution -->
    <xsl:param name="arg3"/>              <!-- nLast -->
    
    <xsl:variable name="vP" select="$arg2[2]"/>
    <xsl:variable name="vOutcome" select="$arg2[1]"/>
    <xsl:variable name="vnNext" 
                  select="$vP * $modulus + $arg3"/>

    <xsl:choose>
      <!--                                                 
        >   | nNext >= rand && rand > nLast                
        >         =  ob                                    
      -->
      <xsl:when test="$vnNext >= $arg1 and $arg1 > $arg3">
        <xsl:copy-of select="$vOutcome/node()"/>
      </xsl:when>
      <!--
        >   | otherwise                                    
        >         = makeFun dist nNext rand                
      -->
      <xsl:otherwise>
        <xsl:call-template name="makeFun">
          <xsl:with-param name="arg1" select="$arg1"/>
          <xsl:with-param name="arg2" 
                          select="$arg2[position() > 2]"/>
          <xsl:with-param name="arg3" select="$vnNext"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>    
  </xsl:template>

Finally, someone will rightfully ask why our random sequences are always the same, starting with exactly the same numbers. A workaround for this issue is to be able to pick up a subsequence of the random sequence, and define the starting element of the subsequence in some non-deterministic way (e.g. use an extension function to get the current time and use the value of the seconds as index for the start of the subsequence).

The implementation is straightforward:

randomSubSequence, dist-randomSubSequence:

<!--
      Template: randomSubSequence                          
      Purpose : Return a sub-sequence of random numbers    
                starting from a given index                
    Parameters:                                            
     $pLength   - [optional] the length of the sequence    
                   of randoms that must be produced        
     $pSubStart - [optional] the index at which            
                   the sub-sequence is to start            
     $pSeed     - [optional] the seed for the randomization
     $pStart    - [optional] the start of the interval,    
                  in which randoms are to be produced      
     $pEnd      - [optional] the end of the interval,      
                  in which randoms are to be produced      
  =========================================================-->
  <xsl:template name="randomSubSequence">
    <xsl:param name="pLength" select="1"/>
    <xsl:param name="pSubStart" select="1"/>
    <xsl:param name="pSeed" select="$seed"/>
    <xsl:param name="pStart" select="0"/>
    <xsl:param name="pEnd" select="$modulus - 1"/>
    
    <xsl:variable name="vrtfInitSequence">
      <xsl:call-template name="randomSequence">
        <xsl:with-param name="pLength" select="$pSubStart"/>
        <xsl:with-param name="pSeed" select="$pSeed"/>
      </xsl:call-template>
    </xsl:variable>
    
    <xsl:call-template name="randomSequence">
        <xsl:with-param name="pLength" select="$pLength"/>
        <xsl:with-param name="pSeed" 
           select="vendor:node-set($vrtfInitSequence)
                                              /*[last()]"/>
        <xsl:with-param name="pStart" select="$pStart"/>
        <xsl:with-param name="pEnd" select="$pEnd"/>
    </xsl:call-template>
  </xsl:template>
  
<!--
      Template: dist-randomSubSequence                     
      Purpose : Return a sub-sequence of random numbers    
                starting from a given index, and according 
                to a given distribution                    
    Parameters:                                            
     $pLength   - [optional] the length of the sequence    
                   of randoms that must be produced        
     $pSubStart - The index at which the sub-sequence      
                  is to start                              
     $pDist     - a list of distributions                  
                  (outcome,probability pairs)              
                  e.g.:                                    
                    <myDistribution:myDistribution>        
                      <o>1</o><p>0.2</p>                   
                      <o>2</o><p>0.25</p>                  
                      <o>3</o><p>0.25</p>                  
                      <o>4</o><p>0.15</p>                  
                      <o>5</o><p>0.1</p>                   
                      <o>6</o><p>0.05</p>                  
                    </myDistribution:myDistribution>       
                                                           
     $pSeed     - [optional] the seed for the randomization
  =========================================================-->
  <xsl:template name="dist-randomSubSequence">
    <xsl:param name="pLength" select="1"/>
    <xsl:param name="pSubStart" select="1"/>
    <xsl:param name="pDist" select="/.."/>
    <xsl:param name="pSeed" select="$seed"/>

    <xsl:variable name="vrtfInitSequence">
      <xsl:call-template name="randomSequence">
        <xsl:with-param name="pLength" select="$pSubStart"/>
        <xsl:with-param name="pSeed" select="$pSeed"/>
      </xsl:call-template>
    </xsl:variable>
 
    <xsl:call-template name="dist-randomSequence">
        <xsl:with-param name="pLength" select="$pLength"/>
        <xsl:with-param name="pSeed" 
           select="vendor:node-set($vrtfInitSequence)
                                              /*[last()]"/>
        <xsl:with-param name="pDist" select="$pDist"/>
    </xsl:call-template>
  </xsl:template>

Let's now test all functions in the Random.xsl module.

test-random.xsl:

<xsl:stylesheet version="1.0" 
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:vendor="urn:schemas-microsoft-com:xslt"
 xmlns:myDistribution="f:myDistribution"
 exclude-result-prefixes="xsl vendor myDistribution"
>

  <xsl:import href="random.xsl"/>
  
  <myDistribution:myDistribution>
    <o>1</o><p>0.2</p>
    <o>2</o><p>0.25</p>
    <o>3</o><p>0.25</p>
    <o>4</o><p>0.15</p>
    <o>5</o><p>0.1</p>
    <o>6</o><p>0.05</p>
  </myDistribution:myDistribution>
  
  <xsl:output indent="yes" omit-xml-declaration="yes"/>
  
  <xsl:variable name="NL" select="'&#xA;'"/>
  
  <xsl:template match="/">
    
    <xsl:variable name="vrtfUnscaledRandoms">
    <xsl:call-template name="randomSequence">
      <xsl:with-param name="pLength" select="100"/>
    </xsl:call-template>
    </xsl:variable>
    
    <xsl:variable name="vUnscaledRandoms" 
     select="vendor:node-set($vrtfUnscaledRandoms)/*"/>
    
100 Randoms in (0,<xsl:value-of select="$modulus - 1"/>):
    <xsl:for-each select="$vUnscaledRandoms
                             [position() mod 8 = 1]">
      <xsl:value-of select="$NL"/>
      <xsl:for-each 
         select=". | following::*[position() &lt; 8]">
        <xsl:value-of select="concat(., ', ')"/>
      </xsl:for-each>
    </xsl:for-each>
    
    <xsl:variable name="vStart" select="1"/>
    <xsl:variable name="vEnd" select="10"/>
    
    <xsl:value-of select="concat($NL,$NL,
                                 '100 Randoms in (',
                                 $vStart, ', ',
                                 $vEnd, '):',
                                 $NL
                                 )"/> 
    <xsl:variable name="vrtfScaledRandoms">
    <xsl:call-template name="randomSequence">
      <xsl:with-param name="pLength" select="100"/>
      <xsl:with-param name="pStart" select="$vStart"/>
      <xsl:with-param name="pEnd" select="$vEnd"/>
    </xsl:call-template>
    </xsl:variable>
    
    <xsl:variable name="vScaledRandoms" 
       select="vendor:node-set($vrtfScaledRandoms)/*"/>
    
    <xsl:for-each select="$vScaledRandoms
                              [position() mod 10 = 1]">
      <xsl:value-of select="$NL"/>
      <xsl:for-each 
         select=". | following::*[position() &lt; 10]">
        <xsl:value-of select="concat(., ', ')"/>
      </xsl:for-each>
    </xsl:for-each>
    
    <xsl:value-of select="concat($NL,$NL,
                                 'Randoms 11-20 in (',
                                 $vStart, ', ',
                                 $vEnd, '):',
                                 $NL
                                 )"/> 
    <xsl:variable name="vrtfRandSubsequence">
      <xsl:call-template name="randomSubSequence">
        <xsl:with-param name="pLength" select="10"/>
        <xsl:with-param name="pSubStart" select="11"/>
        <xsl:with-param name="pStart" select="$vStart"/>
        <xsl:with-param name="pEnd" select="$vEnd"/>
      </xsl:call-template>
    </xsl:variable>
    
    <xsl:copy-of 
      select="vendor:node-set($vrtfRandSubsequence)/*"/>
    
    <xsl:variable name="vrtfDistRandSeq">
      <xsl:call-template name="dist-randomSequence">
        <xsl:with-param name="pLength" select="100"/>
        <xsl:with-param name="pDist" 
         select="document('')/*/myDistribution:*[1]/*"/>
        
      </xsl:call-template>
    </xsl:variable>
    
    <xsl:value-of select="concat($NL,$NL,
      '100 Random outcomes 1-6 with distribution:&#xA;'
                                 )"/> 
    <xsl:copy-of 
         select="document('')/*/myDistribution:*[1]"/>
    <xsl:value-of 
          select="concat($NL, $NL, 'Results:', $NL)"/>
    
    <xsl:variable name="vDistRandSeq" 
         select="vendor:node-set($vrtfDistRandSeq)/*"/>
    
    <xsl:for-each select="$vDistRandSeq
                              [position() mod 10 = 1]">
      <xsl:value-of select="$NL"/>
      <xsl:for-each 
         select=". | following::*[position() &lt; 10]">
        <xsl:value-of select="concat(., ', ')"/>
      </xsl:for-each>
    </xsl:for-each>
    
        <xsl:value-of select="concat($NL,$NL,
           'Randoms 11-20 with the same distribution:',
                                 $NL
                                 )"/> 
    <xsl:variable name="vrtfDistRandSubsequence">
      <xsl:call-template name="dist-randomSubSequence">
        <xsl:with-param name="pLength" select="10"/>
        <xsl:with-param name="pSubStart" select="11"/>
        <xsl:with-param name="pDist" 
         select="document('')/*/myDistribution:*[1]/*"/>
      </xsl:call-template>
    </xsl:variable>
    <xsl:copy-of 
    select="vendor:node-set($vrtfDistRandSubsequence)/*"/>
  </xsl:template>
</xsl:stylesheet>

When applied on any xml source document, the above transformation produces the following results:

100 Randoms in (0,65535):
    
17489, 59134, 9327, 52468, 43805, 8378, 18395, 59344, 
52777, 23478, 21895, 18924, 6517, 29682, 22899, 61256, 
14593, 34158, 40863, 5092, 6349, 60458, 46091, 13248, 
58585, 17446, 25271, 2780, 2341, 26978, 47011, 38200, 
12721, 30686, 65231, 3796, 19069, 52122, 50235, 62384, 
32649, 150, 54247, 65484, 15573, 62162, 14803, 12072, 
11873, 48718, 16895, 48580, 16429, 48906, 30827, 10144, 
40505, 37126, 43287, 10428, 46213, 4162, 57347, 48408, 
12049, 22718, 26927, 8372, 63965, 50810, 53403, 53136, 
16617, 62838, 57927, 34220, 28725, 49586, 43571, 16136, 
13249, 18222, 29791, 14244, 30605, 57834, 52427, 60288, 
26521, 11750, 32631, 5788, 28645, 1826, 39011, 46328, 
15473, 35230, 25487, 660, 

100 Randoms in (1, 10):

3, 10, 2, 9, 7, 2, 3, 10, 9, 4, 
4, 3, 1, 5, 4, 10, 3, 6, 7, 1, 
1, 10, 8, 3, 9, 3, 4, 1, 1, 5, 
8, 6, 2, 5, 10, 1, 3, 8, 8, 10, 
5, 1, 9, 10, 3, 10, 3, 2, 2, 8, 
3, 8, 3, 8, 5, 2, 7, 6, 7, 2, 
8, 1, 9, 8, 2, 4, 5, 2, 10, 8, 
9, 9, 3, 10, 9, 6, 5, 8, 7, 3, 
3, 3, 5, 3, 5, 9, 9, 10, 5, 2, 
5, 1, 5, 1, 6, 8, 3, 6, 4, 1, 

Randoms 11-20 in (1, 10):
<el>4</el>
<el>3</el>
<el>1</el>
<el>5</el>
<el>4</el>
<el>10</el>
<el>3</el>
<el>6</el>
<el>7</el>
<el>1</el>

100 Random outcomes 1-6 with distribution:
<myDistribution:myDistribution>
<o>1</o><p>0.2</p>
<o>2</o><p>0.25</p>
<o>3</o><p>0.25</p>
<o>4</o><p>0.15</p>
<o>5</o><p>0.1</p>
<o>6</o><p>0.05</p>
</myDistribution:myDistribution>

Results:

2, 5, 1, 4, 3, 1, 2, 5, 4, 2, 
2, 2, 1, 3, 2, 5, 2, 3, 3, 1, 
1, 5, 4, 2, 5, 2, 2, 1, 1, 2, 
4, 3, 1, 3, 6, 1, 2, 4, 4, 6, 
3, 1, 4, 6, 2, 5, 2, 1, 1, 4, 
2, 4, 2, 4, 3, 1, 3, 3, 3, 1, 
4, 1, 5, 4, 1, 2, 2, 1, 6, 4, 
4, 4, 2, 6, 5, 3, 2, 4, 3, 2, 
2, 2, 3, 2, 3, 5, 4, 5, 2, 1, 
3, 1, 2, 1, 3, 4, 2, 3, 2, 1, 

Randoms 11-20 with the same distribution:
<el>2</el>
<el>2</el>
<el>1</el>
<el>3</el>
<el>2</el>
<el>5</el>
<el>2</el>
<el>3</el>
<el>3</el>
<el>1</el>

3. Testing randomness with Monte Carlo integration

One way to test the "randomness" of a given random generator is to use it in Monte Carlo integration to calculate the (known) value of an integral. We implement the simplest possible case of Monte-Carlo integration, in which

f(x) > 0 for x in [x_a, x_b]

and the absolute maximum value of f(x) in this interval is known. This allows us to specify an interval in which f(x) will vary in [0, y_max].

The idea of this method of integration is simple: we generate "random points" and count how many of them are bellow the graphic of f(x), and how many - above. In case the "random points" are really uniformly dispersed over the area, then the ratio of points-bellow to total-points should be the same as the ratio of the space covered by the graphic of f(x) and of the straight line g(x) = y_max. Then:

b points-bellow
∫f(x)dx = (y_max - y_min) ----------------
a total-points

Warning: This method converges notoriously slow (sqrt(1/N)) and is not recommended for use other than testing!

In order to implement Monte Carlo integration, we need just one additional function that would scale a random number into a floating-point number, belonging to a given interval. The scaling transformation is defined as follows:

X_sc = a*X + b

Where

a = (X_b - X_a) / (modulus - 1)

b = X_a

Y_sc = c*Y + d

c = (Y_max - Y_min) / (modulus - 1)

d = Y_min

Bellow is the XSLT implementation of simple Monte-Carlo integration, also presented is the generation of random floating point numbers.

<xsl:stylesheet version="1.0" 
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:vendor="urn:schemas-microsoft-com:xslt"
 xmlns:x="f:fxsl-monteCarlo"
 xmlns:genTest="f:genTest" 
 xmlns:rFloat="f:rFloat"
 exclude-result-prefixes="xsl vendor x 
                          genTest rFloat"
 >
 
  <xsl:import href="random.xsl"/>
  <xsl:import href="iter.xsl"/>
  
  <!-- Fun: GenerateAndTest -->
  <genTest:genTest/>  
  <!-- Fun: rFloat -->
  <rFloat:rFloat/>    
  
  <xsl:variable name="x:st" 
                select="document('')/*"/>
  
  <xsl:template name="monteCarlo">
    <!-- n (Iterations) -->
    <xsl:param name="arg1" select="1"/>
    <!-- f (fn to be integrated) -->
    <xsl:param name="arg2" select="/.."/>   
    <!-- sx (start of interval for x) -->
    <xsl:param name="arg3"/>                
    <!-- tx (end of interval for x) -->
    <xsl:param name="arg4"/>                
    <!-- sy (start of interval for y) -->
    <xsl:param name="arg5" select="'0'"/>   
    <!-- ty (end of interval for y) -->
    <xsl:param name="arg6"/>                
    <!-- starting seed -->
    <xsl:param name="arg7" select="$seed"/> 
    
    <xsl:variable name="vGenAndTest" 
                  select="$x:st/genTest:*[1]"/>
    <xsl:variable name="vRFloat" 
                  select="$x:st/rFloat:*[1]"/>
    
    <xsl:variable name="vIntRange" 
                  select="$modulus - 1"/>
    
<!-- rndFloatX = rFloat((tx - sx)/dintRange) sx -->
    <xsl:variable name="vrtf-rndFloatX">
      <xsl:call-template name="curry">
        <xsl:with-param name="pNargs" 
                        select="3"/>
        <xsl:with-param name="pFun" 
                        select="$vRFloat"/>
        <xsl:with-param name="arg1" 
             select="($arg4 - $arg3) 
                    div $vIntRange"/>
        <xsl:with-param name="arg2" 
                        select="$arg3"/>
      </xsl:call-template>
    </xsl:variable>
    
<!-- rndFloatY = rFloat((ty - sy)
                    / dintRange) sy -->
    <xsl:variable name="vrtf-rndFloatY">
      <xsl:call-template name="curry">
        <xsl:with-param name="pNargs" 
                        select="3"/>
        <xsl:with-param name="pFun" 
                        select="$vRFloat"/>
        <xsl:with-param name="arg1" 
             select="($arg6 - $arg5) 
                    div $vIntRange"/>
        <xsl:with-param name="arg2" 
                        select="$arg5"/>
      </xsl:call-template>
    </xsl:variable>
    
    <xsl:variable name="vrtfGTParams">
     <cnt>0</cnt>
     <seed>
       <xsl:value-of select="$arg7"/>
       </seed>
    </xsl:variable>
    
    <xsl:variable name="vrtf-GenerateAndTest">
      <xsl:call-template name="curry">
        <xsl:with-param name="pNargs" 
                        select="4"/>
        <xsl:with-param name="pFun" 
                        select="$vGenAndTest"/>
        <xsl:with-param name="arg2" 
         select="vendor:node-set($vrtf-rndFloatX)/*"/>
        <xsl:with-param name="arg3" 
         select="vendor:node-set($vrtf-rndFloatY)/*"/>
        <!-- f --> 
        <xsl:with-param name="arg4" 
                        select="$arg2"/> 
      </xsl:call-template>
    </xsl:variable>
    
    <xsl:variable name="vResultIterations">
      <xsl:call-template name="iter">
        <xsl:with-param name="pTimes" 
                        select="$arg1"/>
        <xsl:with-param name="pFun" 
   select="vendor:node-set($vrtf-GenerateAndTest)/*"/>
        <xsl:with-param name="pX" 
         select="vendor:node-set($vrtfGTParams)/*"/>
      </xsl:call-template>
    </xsl:variable>
    
    <xsl:variable name="vnHits" 
      select="vendor:node-set($vResultIterations)/*[1]"/>
    
    <xsl:variable name="vSpace" 
         select="($arg4 - $arg3) * ($arg6 - $arg5)"/>
    
    <xsl:value-of 
         select="$vnHits div $arg1 * $vSpace"/>
  </xsl:template>
  
  <xsl:template match="genTest:*">
    <!-- (ct, sd) -->
    <xsl:param name="arg1" select="/.."/> 
    <!-- rndFloatX -->
    <xsl:param name="arg2" select="/.."/> 
    <!-- rndFloatY -->
    <xsl:param name="arg3" select="/.."/> 
    <!-- f -->
    <xsl:param name="arg4" select="/.."/> 
    
    <xsl:variable name="vCount" 
                  select="$arg1[1]"/>
    <xsl:variable name="vSeed"  
                  select="$arg1[2]"/>
    
    <xsl:variable name="vNextSeed">
      <xsl:call-template name="randNext">
        <xsl:with-param name="arg1" 
                        select="$vSeed"/>
      </xsl:call-template>
    </xsl:variable>
    
    <xsl:variable name="v-rndX">
      <xsl:apply-templates select="$arg2">
        <xsl:with-param name="arg3" 
                        select="$vSeed"/>
      </xsl:apply-templates> 
    </xsl:variable>
    
    <xsl:variable name="vF-rndX">
      <xsl:apply-templates select="$arg4">
        <xsl:with-param name="arg1" 
                        select="$v-rndX"/>
      </xsl:apply-templates>
    </xsl:variable>
    
    <xsl:variable name="v-rndY">
      <xsl:apply-templates select="$arg3">
        <xsl:with-param name="arg3" 
                        select="$vNextSeed"/>
      </xsl:apply-templates> 
    </xsl:variable>
    
    <xsl:choose>
      <xsl:when test="$vF-rndX >= $v-rndY">
        <cnt>
         <xsl:value-of select="$vCount + 1"/>
        </cnt>
      </xsl:when>
      <xsl:otherwise>
        <cnt>
         <xsl:value-of select="$vCount"/>
        </cnt>
      </xsl:otherwise>
    </xsl:choose>
        <seed>
         <xsl:value-of select="$vNextSeed"/>
        </seed>
    
  </xsl:template>
  
  <!-- computes a*sd + b -->
  <xsl:template match="rFloat:*"> 
    <xsl:param name="arg1"/> <!-- a -->
    <xsl:param name="arg2"/> <!-- b -->
    <xsl:param name="arg3"/> <!-- sd -->
    
    <xsl:value-of select="$arg1*$arg3 + $arg2"/>
    
  </xsl:template>
</xsl:stylesheet>

And here's is a test with 65536 random points (as you have been warned, don't attempt to run this on a slow machine!) and the results:

test-monteCarlo.xsl

<xsl:stylesheet version="1.0" 
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:x="f:fxsl-test-monteCarlo"
 xmlns:mySampleFun1="f:mySampleFun1"
 xmlns:mySampleFun2="f:mySampleFun2"
 xmlns:mySampleFun3="f:mySampleFun3"
 exclude-result-prefixes="xsl mySampleFun1"
 >
 
  <xsl:import href="monteCarlo.xsl"/>
  
  <xsl:output method="text"/>
  
  <mySampleFun1:mySampleFun1/>
  <mySampleFun2:mySampleFun2/>
  <mySampleFun3:mySampleFun3/>

  <xsl:variable name="x:st" 
                select="document('')/*"/>

  <xsl:template match="/">
    <xsl:variable name="vSampleFun1" 
         select="$x:st/mySampleFun1:*[1]"/>
    <xsl:variable name="vSampleFun2" 
         select="$x:st/mySampleFun2:*[1]"/>
    <xsl:variable name="vSampleFun3" 
         select="$x:st/mySampleFun3:*[1]"/>
    
    <xsl:call-template name="monteCarlo">
      <xsl:with-param name="arg1" 
                      select="65536"/>
      <xsl:with-param name="arg2" 
                select="$vSampleFun1"/>
      <xsl:with-param name="arg3" 
                      select="'0'"/> <!-- sx -->
      <xsl:with-param name="arg4" 
                      select="1"/>   <!-- tx -->
      <xsl:with-param name="arg5" 
                      select="'0'"/> <!-- sy -->
      <xsl:with-param name="arg6" 
                      select="4"/>   <!-- ty -->
    </xsl:call-template>
    <xsl:text>&#xA;</xsl:text>
    <xsl:call-template name="monteCarlo">
      <xsl:with-param name="arg1" 
                      select="65536"/>
      <xsl:with-param name="arg2" 
                select="$vSampleFun2"/>
      <xsl:with-param name="arg3" 
                      select="'0'"/> <!-- sx -->
      <xsl:with-param name="arg4" 
                      select="1"/>   <!-- tx -->
      <xsl:with-param name="arg5" 
                      select="'0'"/> <!-- sy -->
      <xsl:with-param name="arg6" 
                      select="1"/>   <!-- ty -->
    </xsl:call-template>
    <xsl:text>&#xA;</xsl:text>
    <xsl:call-template name="monteCarlo">
      <xsl:with-param name="arg1" 
                      select="65536"/>
      <xsl:with-param name="arg2" 
                select="$vSampleFun3"/>
      <xsl:with-param name="arg3" 
                      select="1"/> <!-- sx -->
      <xsl:with-param name="arg4" 
                      select="2"/>   <!-- tx -->
      <xsl:with-param name="arg5" 
                      select="'0'"/> <!-- sy -->
      <xsl:with-param name="arg6" 
                      select="1"/>   <!-- ty -->
    </xsl:call-template>
  </xsl:template>
  
  <!-- f x = 4 / (1 + x^2) -->
  <xsl:template match="mySampleFun1:*"> 
    <xsl:param name="arg1"/>
    
    <xsl:value-of select="4 
                       div (1 + $arg1*$arg1)"/>
  </xsl:template>

  <!-- f x = x -->
  <xsl:template match="mySampleFun2:*"> 
    <xsl:param name="arg1"/>
    
    <xsl:value-of select="$arg1"/>
  </xsl:template>

  <!-- f x = 1 / x -->
  <xsl:template match="mySampleFun3:*"> 
    <xsl:param name="arg1"/>
    
    <xsl:value-of select="1 div $arg1"/>
  </xsl:template>
</xsl:stylesheet>

And the results are:

3.1414794921875

0.4999847412109375

0.693206787109375

As the actual values of the three integrals are well-known and must be respectively: pi, ½ and ln(2), we can see that our random generator does a pretty good job.

Conclusion

This article proves once again that it is easy to implement even difficult and scary algorithms using powerful functional programming functions and design patterns. The FXSL functional programming library for XSLT makes this possible. As a beneficial side effect, a new, useful module -- Random.xsl --has been added to FXSL. It will serve the needs of XSLT 1.0 and XSLT 2.0 users.

Resources

[1] The Functional Programming Language XSLT - A proof through examples

By D.Novatchev

The first in a series of articles, demonstrating the implementation of functional programming in XSLT and its practical use.

[2] Dynamic Functions using FXSL: Composition, Partial Applications and Lambda Expressions

By D.Novatchev

The second article in the "functional programming in XSLT series", explaining and demonstrating the implementation of some of the most powerful design patterns of functional programming.

[3] The FXSL functional programming library

A download from the FXSL SourceForge.Net project. Contains the source code of the functions from [1], [2] and this article, and much more. FXSL editions exist for: Saxon, MSXML and Xalan (adapted by Knut Wannheden).

[4] Haskell: The Craft of Functional Programming

By Simon Thompson,

Second Edition, Addison-Wesley, 507 pages, paperback, 1999. ISBN 0-201-34275-8.

This book is essential reading for beginners to functional programming and newcomers to the Haskell programming language. The emphasis is on the process of crafting programs and the text contains many examples and running case studies, as well as advice and program design, testing, problem solving and how to avoid common pitfalls.

[5] XML Path Language (XPath), Version 1.0, W3C Recommendation 16 November 1999

[6] XQuery 1.0 and XPath 2.0 Functions and Operators, World Wide Web Consortium. XQuery 1.0 and XPath 2.0 Functions and Operators W3C Working Draft, 20 December 2001.